AITopics

Country: South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.05)

Technology: Information Technology > Artificial Intelligence (0.48)

Neural Information Processing Systems, Feb-9-2026, 15:45:06 GMT

6cd3ac24cdb789beeaa9f7145670fcae-Paper-Conference.pdf

language recognition, recognition, sign language recognition, (15 more...)

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.71)

Neural Information Processing Systems, Feb-8-2026, 00:44:03 GMT

Generative View Synthesis: From Single-view Semantics to Novel-view Images

Not surprisingly, GVS also inherits the challenges of both: generating RGB values from a bare semantic map is an ill-posed problem that is further complicated by the need for the different output views to be photometrically and geometrically consistent. One could tackle this problem with the sequential application of existing techniques.

artificial intelligence, machine learning, representation, (16 more...)

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (0.47)
Information Technology > Artificial Intelligence > Machine Learning (0.46)

Neural Information Processing Systems, Aug-15-2025, 15:54:30 GMT

6cd3ac24cdb789beeaa9f7145670fcae-Paper-Conference.pdf

language recognition, recognition, sign language recognition, (15 more...)

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.71)

Brioso, Ricardo Coimbra, Crespi, Leonardo, Seghetto, Andrea, Dei, Damiano, Lambri, Nicola, Mancosu, Pietro, Scorsetti, Marta, Loiacono, Daniele

ARTInp: CBCT-to-CT Image Inpainting and Image Translation in Radiotherapy

arXiv.org Artificial Intelligence, Feb-7-2025

A key step in Adaptive Radiation Therapy (ART) workflows is the evaluation of the patient's anatomy at treatment time to ensure the accuracy of the delivery. To this end, Cone Beam Computerized Tomography (CBCT) is widely used because it is cost-effective and easy to integrate into the treatment process. Nonetheless, CBCT images have lower resolution and more artifacts than CT scans, making them less reliable for precise treatment validation. Moreover, in complex treatments such as Total Marrow and Lymph Node Irradiation (TMLI), where full-body visualization of the patient is critical for accurate dose delivery, the CBCT images are often discontinuous, leaving gaps that could contain relevant anatomical information. To address these limitations, we propose ARTInp (Adaptive Radiation Therapy Inpainting), a novel deep-learning framework combining image inpainting and CBCT-to-CT translation. ARTInp employs a dual-network approach: a completion network that fills anatomical gaps in CBCT volumes and a custom Generative Adversarial Network (GAN) that generates high-quality synthetic CT (sCT) images. We trained ARTInp on a dataset of paired CBCT and CT images from the SynthRad 2023 challenge, and the performance achieved on a test set of 18 patients demonstrates its potential for enhancing CBCT-based workflows in radiotherapy.

artificial intelligence, machine learning, translation network, (18 more...)

2502.04898

Country:

Europe > Italy > Lombardy > Milan (0.04)
Europe > Portugal > Coimbra > Coimbra (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Nuclear Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Neural Information Processing Systems, Jan-26-2025, 18:24:54 GMT

Review for NeurIPS paper: Practical No-box Adversarial Attacks against DNNs

I wouldn't say this is a weakness, but it would be good to have some works on adversarial attacks on auto-encoders and image translation networks cited in the paper. All of these works have the defining characteristic of attacking networks that have an image as an input and an image as an output, and their attacks are adapted from attacks such as FGSM, I-FGSM and PGD. This is related to the idea of the attack presented, since it is generated on an auto-encoder and transferred to a target model. None of these mentioned works diminish the novelty of the paper's ideas; they are just related. But that is just me.

neurips paper, practical no-box adversarial attack, translation network, (2 more...)

Industry:

Information Technology > Security & Privacy (0.96)
Government > Military (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.83)

Li, Haotian, Siebes, Arno, Mehrkanoon, Siamak

Self-supervised Spatial-Temporal Learner for Precipitation Nowcasting

arXiv.org Artificial Intelligence, Dec-20-2024

Nowcasting, the short-term prediction of weather, is essential for making timely and weather-dependent decisions. Specifically, precipitation nowcasting aims to predict precipitation at a local level within a 6-hour time frame. This task can be framed as a spatial-temporal sequence forecasting problem, where deep learning methods have been particularly effective. However, despite advancements in self-supervised learning, most successful methods for nowcasting remain fully supervised. Self-supervised learning is advantageous for pretraining models to learn representations without requiring extensive labeled data. In this work, we leverage the benefits of self-supervised learning and integrate it with spatial-temporal learning to develop a novel model, SpaT-SparK. SpaT-SparK comprises a CNN-based encoder-decoder structure pretrained with a masked image modeling (MIM) task and a translation network that captures temporal relationships among past and future precipitation maps in downstream tasks. We conducted experiments on the NL-50 dataset to evaluate the performance of SpaT-SparK. The results demonstrate that SpaT-SparK outperforms existing baseline supervised models, such as SmaAt-UNet, providing more accurate nowcasting predictions.

artificial intelligence, machine learning, precipitation, (16 more...)

2412.15917

Country:

Europe > Netherlands (0.05)
Europe > Belgium (0.04)

Genre:

Research Report > New Finding (0.49)
Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Pandey, Ananya, Vishwakarma, Dinesh Kumar

Target-Dependent Multimodal Sentiment Analysis Via Employing Visual-to Emotional-Caption Translation Network using Visual-Caption Pairs

arXiv.org Artificial Intelligence, Aug-5-2024

The natural language processing and multimedia field has seen a notable surge in interest in multimodal sentiment recognition. Hence, this study aims to employ Target-Dependent Multimodal Sentiment Analysis (TDMSA) to identify the level of sentiment associated with every target (aspect) stated within a multimodal post consisting of a visual-caption pair. Despite the recent advancements in multimodal sentiment recognition, there has been a lack of explicit incorporation of emotional clues from the visual modality, specifically those pertaining to facial expressions. The challenge at hand is to proficiently obtain visual and emotional clues and subsequently synchronise them with the textual content. In light of this fact, this study presents a novel approach called the Visual-to-Emotional-Caption Translation Network (VECTN) technique. The primary objective of this strategy is to effectively acquire visual sentiment clues by analysing facial expressions. Additionally, it effectively aligns and blends the obtained emotional clues with the target attribute of the caption mode. The experimental findings demonstrate that our methodology is capable of producing ground-breaking outcomes when applied to two publicly accessible multimodal Twitter datasets, namely, Twitter-2015 and Twitter-2017. The experimental results show that the suggested model achieves an accuracy of 81.23% and a macro-F1 of 80.61% on the Twitter-2015 dataset, and 77.42% and 75.19%, respectively, on the Twitter-2017 dataset. The observed improvement in performance reveals that our model is better than others when it comes to collecting target-level sentiment in multimodal data using the expressions of the face.

face description, information, sentiment analysis, (12 more...)

2408.10248

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Asia > India (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(9 more...)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
(3 more...)

Medeiros, Heitor Rapela, Aminbeidokhti, Masih, Pena, Fidel Guerrero, Latortue, David, Granger, Eric, Pedersoli, Marco

Modality Translation for Object Detection Adaptation Without Forgetting Prior Knowledge

arXiv.org Artificial Intelligence, Apr-11-2024

A common practice in deep learning consists of training large neural networks on massive datasets to perform accurately for different domains and tasks. While this methodology may work well in numerous application areas, it applies poorly across modalities because of the larger distribution shift in data captured using different sensors. This paper focuses on the problem of adapting a large object detection model to one or multiple modalities while being efficient. To do so, we propose ModTr as an alternative to the common approach of fine-tuning large models. ModTr consists of adapting the input with a small transformation network trained to minimize the detection loss directly. The original model can therefore work on the translated inputs without any further change or fine-tuning to its parameters. Experimental results on translating from IR to RGB images on two well-known datasets show that this simple ModTr approach provides detectors that can perform comparably or better than the standard fine-tuning without forgetting the original knowledge. This opens the doors to a more flexible and efficient service-based detection pipeline in which, instead of using a different detector for each modality, a unique and unaltered server is constantly running, where multiple modalities with the corresponding translations can query it.

dataset, detector, modality, (16 more...)